ABSTRACT

This report illustrates one way in which the
technologies of testing might combine with cognitive science
techniques to help steer instruction. Steering testing is brief
diagnostic testing that steers, or individualizes, the course of
instruction. Steering testing uses simple heuristics for reasoning
about the level of a student's competence in a particular subskill
and intelligently manufactures practice opportunities that will be
especially revealing about the student's current competences.
Theoretically, steering testing should permit a partly logical
constraining of diagnosis and should be based on a representation of
the knowledge needed to exercise the skill it purports to measure.
Four types of knowledge involved in dealing with a student need
clarification when designing computer systems for steering testing:
(1) domain expertise; (2) curriculum knowledge; (3) planning
knowledge; and (4) treatment knowledge. In addition, a student model,
a knowledge structure specifying which subskills a student is thought
to know or not to know, is embedded in the curricular goal structure
of the system. When a diagnosis is needed, the student model is
examined to identify areas of competence about which more information
is needed. These areas represent constraints on the type of test item
that will be informative. Once the constraints are posted, an
intelligent item generator constructs test items that satisfy them.
To illustrate these ideas, an intelligent computer-based tutor with
a problem-solving mode that teaches basic electrical principles is
discussed. (BS)

[...] course of instruction. When a diagnosis is needed, the student model (i.e., what is currently
known about the student's competence) is examined, and areas of competence about which more
information is needed are identified. These areas represent constraints on the type of test
item that will be informative. Once the constraints are posted, an intelligent item generator
constructs a test item that satisfies them. We discuss steering instruction in two intelligent
computer-based tutors for adult skills: avionics troubleshooting and DC-circuit
understanding.

One of Robert Glaser's special contributions to psychology and education is the concept of
criterion-referenced testing (Glaser, 1983). While norm-referenced testing supports decisions that
involve choosing among people or otherwise comparing them, criterion-referenced tests tell us
something about what people know or what they can do. In introducing the concept, Glaser was
beginning a long advocacy of adaptive education, of shaping education to each person's current
competences rather than choosing to educate only the people who score highest on general tests.

While this was his goal, most work on criterion-referenced testing (cf. Hambleton, 1984) has
focused on issues relating to certification, to setting of standards for educational outcomes, and to
tracking, that is, on selection more than on adaptation. There are a number of reasons for this, but the
situation can be summarized as follows. Adaptive education is a steering process. Norm-referenced
tests are designed to indicate reliably who is out in front; criterion-referenced tests are designed to tell
us exactly where each person is; but knowing where you are is not the same as knowing how to steer a
course toward a planned destination.

The purpose of this chapter is to illustrate one way in which the technologies of testing might
combine with certain cognitive science techniques to help steer instruction. We focus on steering an
intelligent tutor, i.e., on student modeling. However, the approach can be generalized to other
instructional forms, including reactive environments (exploratory microworlds) and perhaps even
conventional classroom instruction. We are discussing diagnostic testing to be used often, in small
amounts, to steer the course of instruction. Further, in contrast to relatively standard (e.g.,
pretest-treatment-posttest) designs for individualizing the teaching of children, we focus on
individualizing the testing process to make it more efficient in steering instruction.

Problems of Diagnostic Testing

Any test, including a diagnostic test, consists of a number of items. The person being tested
carries out some performance on each of the items, scores are assigned to those performances, and those
scores are aggregated to arrive at an evaluation. To make steering tests, we need test items that are
relevant to the specific steering decisions that must be made about a particular student in a particular
context, and we need procedures for scoring performance on those items. Steering tests must be
efficient to administer, since steering requires frequent, but not necessarily precise, feedback (given
the inertia of teaching and learning, the steering error produced by believing an imprecise test will
probably be canceled out by subsequent course corrections).

[...] methods are not designed for steering tests. They are designed to assure
that different forms of a test are equivalent and that the scores on that test are reliable. The problem
[...] be brief, so that testing does not take too much time from learning [...] reliability of [...]

There are two ways a test can be made more reliable. The first is to increase the extent to which
performance on its items directly reflects the skills one wishes to assess. This can be done statistically
or substantively. [...]

The second way to make a test more reliable is to use knowledge about the student's performance
on prior items to limit the information each new test item must provide. Adaptive testing algorithms
have been developed for this purpose. They use a sequential strategy. After the student completes an
item, an estimate of the student's performance based upon the items so far completed is used to select
the most informative next item to administer, and then the score on that next item is used to update
the estimate. The adaptive testing approach, which almost always requires a computer for the
real-time estimates just mentioned, can be applied even when nothing more than the difficulty
ordering of items is known. However, it is especially powerful if more detailed information about the
items is available. Again, a theory that relates performance on various test items to underlying
competences and their acquisition can be helpful, even if it is incomplete.

In at least one case, adaptive testing techniques were applied to diagnostic testing (Spineti & 
Hambleton, 1977). Spineti and Hambleton used learning hierarchies specified by rational task 
analysis (Gagne, 1966) to help constrain the estimation process. That is, they decided on items 
according to an analysis of the material being learned and to some theoretical predictions of the order 
of acquisition for parts of that material. Doing this, they were able to achieve a 50% reduction in the
number of items required to achieve a given level of score reliability. 

The approach we have taken to steering testing is somewhat different. It uses very simple
heuristics for reasoning about the level of a student's competence in particular subskills. Its power
derives primarily from its ability to intelligently manufacture practice opportunities (test items) for 
the student that will be especially revealing about his current competences. We believe, although it 
remains to be proven, that these practice opportunities are generally appropriate learning vehicles as 
well as test items. In that sense, we are pursuing steering as a unified system in which testing and
learning are combined. 

In our view, a cognitive theory of testing, and especially a theory of steering testing, should have 
two characteristics. First, it should permit a partly logical (in contrast to a purely statistical) 
constraining of diagnosis. Second, it should be based on a representation of the knowledge that is 
needed to exercise the skill it purports to measure. The logical approach is not at all foreign to our 
experience. When one is sick and goes to a physician, one is not satisfied with broad probabilistic 
statements. Rather, one expects a diagnosis constrained by the physician's knowledge of disease. 
More specifically, we expect the physician to be asking herself what diseases could produce the overall
complex of symptoms and signs presenting themselves to her. Diagnosis in medicine, then, is the
designing of a personal theory of a specific patient's pathology. This personal theory is rooted in
theories of disease mechanisms and not just in unexplained statistical relationships. 

The diagnosis process is dynamic. For example, based on the hypothesis that a patient has heart
disease, the physician may probe for more explicit detail about certain symptoms or order a test that
may confirm or refute her theory. A teacher does this too when prior knowledge about a student,
combined with current observations, leads her to attribute grammatical errors in the student's paper 
either to inexperience with written language or to use of nonstandard dialect or to a mistaken sense of 
when formal conventions are needed. 

The good teacher's diagnosis differs from that of a physician in one respect, though. We come to a
physician to get a diagnosis when something is wrong; she does not generally shape continuing
decisions about how we should act (except perhaps in developing special regimens, e.g., diets for
control of diabetes). A teacher, in contrast, is carrying out an active, goal-directed activity, teaching,
which needs only small course corrections. Consequently, it is reasonable to conduct the testing
from the teacher's point of view, at least in part.

We would like to produce tests that capture some of the capabilities of the most perceptive and 
observant teachers. We want them to be driven mostly from the teacher's goal structures for teaching 
but also to respond to knowledge of the expertise the teacher is trying to convey, the treatments 
available to the teacher for effecting learning, and certain more global teacher concerns, such as 
adapting to general differences in aptitude and general characteristics of competence at different
levels of learning.

In the next section, we discuss the different kinds of knowledge that are needed to adapt teaching 
to an individual student's course of learning. We take the viewpoint of intelligent tutoring system
design, but the same concerns arise in all approaches to instruction. This is followed by sections in
which a specific approach to the generation of diagnostically and educationally useful problems is
discussed.

Components of Teaching and Testing Knowledge

Several different kinds of knowledge are required in our approach to steering testing. Especially
when designing computer systems to teach or to test, it is important to clarify the knowledge, or
competence, that is involved in dealing with a student. We have categorized that knowledge into four
types. These are domain expertise, curriculum knowledge, instructional planning knowledge, and
treatment knowledge. Each type of knowledge has different structures, different [...] methods,
and different purpose and applicability. Further, there are a variety of connections from one type of
knowledge to another. Figure 1 shows these four categories with [...] kinds of knowledge
they contain for an electricity tutoring/testing system under development at the Learning Research
and Development Center.

Domain Expertise 

Domain expertise is always embodied in instructional decision making, either explicitly or
implicitly. Deep diagnosis of student difficulties may require an explicit representation of the
knowledge required for the performances that are the goals of [...]. For example, the ability of
a computer-based tutor to diagnose bugs (systematic errors) in [...] arithmetic performances
requires having a model of the algorithms that experts use in [...] those performances. Also,
feedback on test performance and advice to the student may have to be couched in terms of procedures
for acting rather than in terms of criteria for outcomes. One way or
another, the performances that constitute the goals of a curriculum derive from information about the
competences that constitute expertise.

Another aspect of domain expertise that is important in instruction and testing is knowledge of
the target task environment. When we speak of what it is we want people to do, we are referring not 
only to the knowledge they need to perform successfully but also to the circumstances under which 
that knowledge must be employed. Again, knowledge of these circumstances might be the basis for 
curricular objectives, but those objectives rest upon domain expertise. If we have the objective that,
given situation X, the student can do Y, it rests upon knowledge of what kind of situation X is and how
Y can be done in X. For example, a student might be able to solve a proportion problem at the time a 
lesson on proportion is presented but not be able to use that knowledge later in solving a word problem 
or even to solve the same problem as one of a set on mixed topics. When testing or teaching is done by 
a computer program, the underlying domain knowledge sometimes must be made explicit. 

Curriculum Knowledge

Curriculum knowledge is the specification of the goal structure that guides the teaching of a body
of expertise. Educational researchers and developers often treat the procedures that constitute
expertise and the instructional goals that constitute curriculum as more or less the same. They
assume that expertise can be split apart easily "at its joints" (to use Plato's phrase). The curriculum,
then, is a natural hierarchy of goals and subgoals to teach the natural units of expertise. From this
viewpoint, curriculum knowledge and domain expertise are the same thing. However, it appears that
there are many different plans for splitting apart expertise, especially when expertise involves
complex performances. For example, consider the curricular issues that arise in teaching simple
electrical principles. There are some basic concepts (voltage, current, and resistance) and some
basic laws (Kirchhoff's Laws and Ohm's Law). In addition, there are different types of circuits: series
and parallel.

So, one legitimate decomposition of the subject might begin with voltage, teaching the behavior
of voltage in series and parallel circuits, then teaching about resistance in the two types of circuits,
and finally treating current. Another decomposition might, with equal legitimacy, build the entire
curriculum on Kirchhoff's current laws. Yet another view might treat parallel circuits as being quite
distinct from series circuits and redevelop the concepts of voltage, resistance, and current separately
for each. We need to capture these multiple viewpoints if they correspond to different curricular goals
about which steering information may be needed. For this reason, the various subgoals of knowledge
that the teacher or curriculum writer can have are best represented by multiple hierarchical goal
structures; these goal structures overlap in the components of expert performance to which they refer.
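
One way to picture overlapping goal structures is as several small hierarchies whose leaves point at a shared pool of expertise components. The sketch below is only an illustration of that idea: the component identifiers, the two views, and the helper `goals_touching` are hypothetical names, not the representation used in the tutor described here.

```python
from typing import Dict, List, Set

# A goal hierarchy maps each curricular goal to the (hypothetical) components
# of expert performance that it refers to.
GoalTree = Dict[str, List[str]]

# View 1: decompose the subject by concept.
by_concept: GoalTree = {
    "voltage": ["voltage_in_series", "voltage_in_parallel"],
    "resistance": ["resistance_in_series", "resistance_in_parallel"],
    "current": ["current_in_series", "current_in_parallel"],
}

# View 2: treat series and parallel circuits as distinct topics.
by_circuit_type: GoalTree = {
    "series_circuits": ["voltage_in_series", "resistance_in_series",
                        "current_in_series"],
    "parallel_circuits": ["voltage_in_parallel", "resistance_in_parallel",
                          "current_in_parallel"],
}

views: Dict[str, GoalTree] = {
    "by_concept": by_concept,
    "by_circuit_type": by_circuit_type,
}


def goals_touching(component: str, all_views: Dict[str, GoalTree]) -> Set[str]:
    """Return every goal, in every view, that refers to the given component."""
    return {
        f"{view_name}/{goal}"
        for view_name, tree in all_views.items()
        for goal, components in tree.items()
        if component in components
    }


print(sorted(goals_touching("current_in_parallel", views)))
```

Evidence about one component (here, current in parallel circuits) then bears on a goal in each view, which is the overlap the text describes.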

Once we concede that instructional goals are not really a simple decomposition of the expertise
being taught into discrete sets and subsets, we are in a position to understand why some testing that is
part of a curriculum may not be as diagnostic as we would hope. Specifically, we can understand why a
student might demonstrate clear competence on a curricular goal that is prerequisite to some other
goal but still appear, from the standpoint of the teacher of that second goal, to not have mastered the
first. For example, a student may demonstrate understanding of Kirchhoff's Current Law but fail to
apply it in a circumstance for which it is relevant. Separating expertise from curriculum allows us to 
understand such situations better. 

Suppose that we consider domain expertise to be represented by a surface. Expert knowledge is,
after all, highly interconnected. Even if it is properly represented as some kind of network, it can be
approximated by a continuous surface (specifically, a manifold of unspecified dimensionality). We
start by assuming that each curricular subgoal corresponds to a region of the expertise continuum.
The expertise subset corresponding to a curricular goal will likely be convex, in the sense that if two
pieces of knowledge are part of the same curricular goal, then any strong relationship that directly ties
them together should also be part of that goal. On the other hand, a curriculum goal's corresponding
expertise is not a completely closed set, since concepts it subsumes may well have connections to other
knowledge that goes beyond the goal. That is, the edges between the expertise subsets corresponding
to different curricular subgoals are not necessarily clean edges with no connections to other
knowledge.

The untargeted knowledge lying between the clusters of expertise directly addressed by the
curriculum can be important in remediating lack of transfer from a curriculum goal's prerequisites to
the final target capability. Ordinarily, instruction is directed at the center of the expertise subset
corresponding to a curricular goal (see Figure 2). This helps keep the new knowledge to be taught
simple enough to be learned. However, this approach can sometimes backfire. For example, if two
bundles of expertise are both curricular goals, their centers may be well taught but their peripheries
ignored. For instance, I may teach you how to compute the joint resistance of two resistors in series,
and this may satisfy an instructional objective. Later, if you need to find the joint resistance of three
resistors in order to solve a problem, you may be able to do that or you may not. In either case, simply
reteaching the two-resistor algorithm will be insufficient.

If a higher-order curricular goal happens to depend upon the integration of the two lower-order
subgoals, it is exactly the edges of their domain knowledge subsets on which it will likely depend. For
decisions about what to teach when remediation seems necessary and also for decisions about how to
interpret apparent inconsistencies in diagnosing whether a curricular subgoal has been achieved,
domain expertise may be needed.

Planning Knowledge 

In addition to specific curricular goals, there are some other higher-order curricular issues that
need to be addressed in planning testing or teaching. Often, these are abstractions from, or specialized
viewpoints on, the curricular goal structure. These may include learning skills, problem solving
heuristics, rather general aptitudes, and even preferences. These concerns, e.g., the more general
"inquiry" skill goals in a science course, overlap some of the higher-level goals in the curriculum. It
could even be argued that these concerns really are part of the curriculum, but we retain the
distinction since planning issues often color the exact form that goal-specific instruction might take.

For example, we would treat as a planning issue the complexity of arithmetic computation that is
required to solve a word problem in a math course. The metagoal is for the student to be able to
advance through the problem-solving part of the curriculum even if his arithmetic skills are
developing more slowly than his problem solving skills. So, the arithmetic required in a word problem
might be adjusted to keep it simple enough to let new problem solving skills develop. Later, when
problem solving skills are strong, the situation might reverse, and increasingly tough arithmetic
might be required whenever the student is predicted to find the problem solving tasks easy. Note that
the issue of arithmetic skills getting in the way of problem solving could arise in curricula other than
math, such as the electrical networks curriculum sketched in Figures 1 and 3. It is for this reason
especially that we choose to treat the matter as a metacurricular planning issue. Sometimes
capability on skills that are not the focus of instruction will require alteration of instructional and
testing strategies for target skills. This is why instruction and testing systems need planning and
metacurricular knowledge.

The planning of teaching must also take into account the long-term, higher-order aspects of
education: metacognitive skills, mature and flexible preferences, and fundamental principles that
apply in many domains. From the point of view of the steering test developer, though, these
higher-order issues represent, for the most part, variables to be controlled. We can't really understand
whether a student knows how to solve electrical network problems, for example, if his capability is
hidden by slow arithmetic performance. So, we have to take account of metacurricular issues in
selecting problems for instructional or measurement use. That is, problems can be selected to require
domain-specific skills but to assure that the student answering a given problem will not be troubled by
weakness on general basic skills that are not the current focus of measurement or instruction. For
example, if a student is weak in arithmetic, a problem might be generated that requires only
small-integer arithmetic. If a different student finds it easier to receive information in graphical form,
the information given for a problem might be presented via a diagram, graph, or even photographic
image.
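
The small-integer case can be made concrete with a short sketch. It assumes a quantitative series-circuit problem in which the student is given the resistances and the source voltage and must find the current; the class `SeriesProblem`, the function `generate_series_problem`, and the numeric ranges are illustrative assumptions, not part of the system described here.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SeriesProblem:
    resistances_ohms: List[int]
    source_volts: int
    answer_amps: int  # current through the series circuit, from Ohm's law


def generate_series_problem(weak_arithmetic: bool,
                            n_resistors: int = 2,
                            rng: Optional[random.Random] = None) -> SeriesProblem:
    """Generate a series-circuit item whose arithmetic load is controlled.

    The circuit knowledge required is the same in both cases; only the size
    of the numbers changes (the ranges below are assumed, not from the text).
    """
    rng = rng or random.Random()
    max_r, max_i = (5, 3) if weak_arithmetic else (470, 12)
    resistances = [rng.randint(1, max_r) for _ in range(n_resistors)]
    current = rng.randint(1, max_i)
    # Pick the source voltage so that the answer is a whole number of amperes.
    volts = current * sum(resistances)
    return SeriesProblem(resistances, volts, current)


easy = generate_series_problem(weak_arithmetic=True, rng=random.Random(0))
hard = generate_series_problem(weak_arithmetic=False, rng=random.Random(0))
print(easy)
print(hard)
```

Working backward from a chosen current keeps every item solvable in whole numbers, so a failure is less likely to reflect arithmetic slips than circuit knowledge.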

Treatment Knowledge 

We turn now to the matter of educational treatments and test item development. Even when we
know what to teach or what to measure, there remains a separate form of expertise involved in
successfully generating a situation in which a piece of knowledge can be exercised. For example,
several different types of problems can be created to test understanding of electrical network
principles (or to provide opportunities for coached practice). Problems can be quantitative or
qualitative. They can deal with unchanging situations or can focus on relative changes in different
measurements of a circuit. Since electricity knowledge must be applied in slightly different ways for
each type of problem, we could treat problem type as a curriculum issue. However, the knowledge an
intelligent system needs about problem categories is different in form from knowledge about
curricular goals. This is especially the case when we want to develop problems for practice or for
steering tests that require integrated use of several different skill components that are separate
curricular goals. The knowledge needed to develop such problems is specific to electricity and to the
teaching of electricity.

Practice and testing that requires multiple skills to be combined is an important goal of our work.
A contrasting approach is taken in some formal instructional development methodologies, such as the
Defense Department's ISD (Merrill & Tennyson, 1977) approach. As generally used, that approach
consists of complete development and elaboration of the curriculum followed by the development of
tests and treatments corresponding to each curricular goal. This seems entirely sensible, an extension
of a management-by-objectives approach. However, if this method is applied superficially, difficulties
can arise. We have already discussed the problem of too-narrow focusing on core concepts without
adequate elaboration and qualification, but there are other, related problems as well. For example, a
variety of apprenticeship situations involve simultaneous practice of a wide range of skill components,
only some of which may be the current targets of instruction. When practice is provided on each skill
component separately, without attention to when each should be used and how they tie together,
fragmentary learning results. The instructor can show, on academic-style tests, that the student
learned each subskill that was to be taught, but the subskills cannot be put together to solve
real-world problems.

This, of course, is a viewpoint that has been taken before. In the world of reading instruction, for
example, we have just seen a long period in which holistic approaches have been taken. Similarly,
case study approaches to the teaching of medicine and business are driven by the same motivation.
There is, of course, some evidence against holistic approaches. For example, Chall (1967) surveyed a
number of reading curricula and found that, on average, weaker students benefited from a phonics
approach, in which recognition of each individual grapheme was the focus of separate instruction. In
the professional world, it is regularly asked how we can be sure that a student who took a case study
course really learned everything he should have. "What if I get a disease that was not one of the cases
discussed?"

We can be a bit more formal about this problem if we view subskills as productions: actions to be
performed under specific conditions. When subskills are taught in isolation, the conditions under
which they should apply cannot be specified, since those conditions relate to the broader context of
holistic performances. Also, there may be specific productions that are not represented as subgoals for
instruction but that are the "glue" needed to combine the productions that were direct curricular
targets.
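
The role of "glue" productions can be shown with a toy production system. Everything in this sketch is an assumption made for illustration: the `Production` class, the `run` interpreter, the state keys, and the three rules are hypothetical and much simpler than any real cognitive model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass
class Production:
    name: str
    condition: Callable[[State], bool]  # when the action applies
    action: Callable[[State], None]     # what to do (mutates the state)


def run(productions: List[Production], state: State,
        max_cycles: int = 20) -> List[str]:
    """Repeatedly fire the first production whose condition holds."""
    fired: List[str] = []
    for _ in range(max_cycles):
        for p in productions:
            if p.condition(state):
                p.action(state)
                fired.append(p.name)
                break
        else:
            break  # no production matched, so stop
    return fired


def _sum_series(s: State) -> None:
    s["r_total"] = sum(s["resistors"])


def _ohms_law(s: State) -> None:
    s["amps"] = s["volts"] / s["r_total"]


def _classify(s: State) -> None:
    s["circuit"] = "series"


# Two productions that were direct curricular targets (hypothetical).
taught = [
    Production("sum_series",
               lambda s: s.get("circuit") == "series" and "r_total" not in s,
               _sum_series),
    Production("ohms_law",
               lambda s: "r_total" in s and "amps" not in s,
               _ohms_law),
]

# A "glue" production that no curricular goal names: it recognizes the situation.
glue = Production("classify_series",
                  lambda s: "circuit" not in s and s.get("single_path") is True,
                  _classify)

problem = {"resistors": [2, 4], "volts": 12, "single_path": True}
print(run(taught, dict(problem)))
print(run(taught + [glue], dict(problem)))
```

With only the taught productions nothing fires, because no rule establishes the condition that the circuit is a series circuit; adding the glue production lets the other two apply in turn.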

An instructional synthesis of the holistic and componential approaches requires several things,
including an understanding of the circumstances under which new subskills or concepts should be
introduced in isolation even if they are later to be practiced more holistically. Of course, the missing
productions, the glue that holds together the subskills we target in our curriculum, cannot be taught
adequately in vitro; they require holistic instruction. The dilemma is that they also need to be
assessed. We may need to help students attend to "gluing" their fragmentary knowledge together if
they have trouble doing so on their own. Further, we may not always choose to introduce new pieces of
knowledge formally and explicitly, hoping that they will be inferred through rich domain experience.
If we take this approach, which may be very efficient, we need to be able to assess later whether there
are any subgoals that were not well attained.

The basic approach we have taken is to generate test items (and instructional treatments, for
that matter) in the course of testing. That is, at any given point in the course of testing, if a question
arises about a specific curricular goal, a test item is generated for it by an intelligent subsystem of the
tutoring program we (primarily the second author) are developing. The items can be shaped by
metacurricular considerations. Further, if multiple skills are required for any realistic performance
within the domain, sets of items can be developed over which particular subskill requirements are
systematically varied.

So, our approach, given a family of cognitive analyses (of expertise, metacurricular issues, and
problem environments in which the expertise can be manifested or practiced), is to intelligently
generate the equivalent of a controlled experiment in which the need for various target pieces of
knowledge is systematically varied. If the student fails to perform items requiring a piece of
knowledge but does perform other items that do not require it, then we infer that work is needed on
that knowledge. Further, we ask only about pieces of knowledge that are in the part of the curriculum
through which we are steering. Finally, rather than make statistical decisions about whether a piece
of knowledge is present or absent, we assume that knowledge can be present at various strength levels
and use experience about the reliability with which a particular piece of knowledge manifests itself to
specify the level of learning of that knowledge.
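
The inference rule just stated can be written down directly. The sketch below is a deliberately simplified, all-or-none reading of it (the text itself prefers graded strength levels); the function name `knowledge_needing_work`, the item format, and the knowledge labels are assumptions made for illustration.

```python
from typing import FrozenSet, List, Set, Tuple

# Each result pairs the set of knowledge pieces an item requires with
# whether the student performed the item successfully.
Result = Tuple[FrozenSet[str], bool]


def knowledge_needing_work(results: List[Result]) -> Set[str]:
    """Flag pieces required by failed items only, while all other items pass."""
    pieces: Set[str] = set()
    for required, _ in results:
        pieces |= required
    flagged: Set[str] = set()
    for piece in pieces:
        with_piece = [ok for required, ok in results if piece in required]
        without_piece = [ok for required, ok in results if piece not in required]
        # Every item that needs the piece failed, and every item that
        # does not need it succeeded (and both kinds were observed).
        if with_piece and without_piece and not any(with_piece) and all(without_piece):
            flagged.add(piece)
    return flagged


# A hypothetical item set in which knowledge requirements are varied.
results: List[Result] = [
    (frozenset({"ohms_law", "series_resistance"}), True),
    (frozenset({"ohms_law", "kirchhoff_current"}), False),
    (frozenset({"series_resistance", "kirchhoff_current"}), False),
    (frozenset({"ohms_law"}), True),
]
print(knowledge_needing_work(results))
```

Because the items vary which pieces they require, the pattern of successes and failures isolates one piece rather than merely lowering an overall score.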

Summary. Perhaps the best way to illustrate the ideas just discussed is to refer back to the
example given above. Figure 3 elaborates the knowledge categories, in part, for our system to teach
and test basic electricity principles. The curriculum knowledge includes three sets of goals: laws,
concepts, and architectures. Under each of these are subgoals. For example, the architectures being
considered are series and parallel circuits (i.e., no bridge circuits). The planning knowledge includes
two sets of planning concerns: the arithmetic difficulty of problems that are presented to the student
and the circuit complexity. Both apply with respect to a variety of curricular subgoals. For example,
circuit complexity may affect whether a student can handle parallel circuits, whether he can apply
Kirchhoff's current law, etc. Arithmetic difficulty could also affect [...] subgoals, especially if
quantitative problems are presented to the student. The treatment knowledge includes information
on problem formats and feedback to the student. Finally, the domain expertise contains specific
details of expertise in handling electrical networks that are referenced by the curriculum
specification.

Generating Test Items from a Student Model

Having described the architecture of the knowledge in a steering testing system, we turn now to
how one uses that knowledge to do assessment driven by a cognitive model of the target capabilities
being taught. We offer as a first approximation an approach that has been tested in prototype form in
an intelligent tutor. It assumes additional knowledge that we have not yet discussed: a student
model, some sort of knowledge structure specifying which subskills the student is thought to know and
which ones not.

We currently specify the student model by embedding it in the curricular goal structure of an
intelligent tutor. For each curricular subgoal, there must be some sort of notation about the student's
assumed competence relative to that subgoal. In one tutor the first author and his colleagues are
building (Lesgold, Lajoie, et al., 1986), there are only four notations: unlearned, perhaps acquired,
probably acquired, and reliably strong. These notations relate to an underlying cognitive model of
learning derived from John Anderson's (1983) work. The rules currently used to change a subskill
notation from one state to another are quite rough, but they are principled.

Movement to the probably learned state implies that a correct production, or set of productions, is
assumed to have been developed by the student. The perhaps state indicates that the student has been
observed to perform the target skill component, but that there is insufficient evidence to conclude that
he knows the conditions as well as the actions for the subskill. The perhaps state is unstable. Either
further correct performances will occur, prompting classification to the probably state, or we will
assume that the single correct performance observed was accidental relative to the problem ecology for
the curriculum, and the student will be moved back to the unlearned state. Recurrent reliable
performance will move a student from probably to strong. One can imagine other approaches in which
the notations might include indicators of misconceptions as well. The important point is that if we
look in on a student who is in the midst of learning a skill, some of the subskills will be clearly
demonstrated already, some will be manifesting obvious problems, some will be unlearned, and some
will be in an unknown status.
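
The transitions just described can be sketched as a small state machine. This is only an
illustrative sketch in Python, not the tutor's actual rules: the two threshold constants, the
`streak` counter, and the function name `update` are assumptions introduced here, since the text
says only that the rules are rough.

```python
from enum import IntEnum
from typing import Tuple


class State(IntEnum):
    UNLEARNED = 0
    PERHAPS = 1
    PROBABLY = 2
    STRONG = 3


# Hypothetical thresholds (assumptions, not values from the report).
CONFIRMATIONS_FOR_PROBABLY = 2  # further correct performances after the first
CONFIRMATIONS_FOR_STRONG = 3    # recurrent reliable performances


def update(state: State, streak: int, correct: bool) -> Tuple[State, int]:
    """Return the new (state, streak) after one observed performance.

    `streak` counts consecutive correct performances in the current state.
    """
    if not correct:
        # A failure in the unstable Perhaps state is read as a sign that the
        # earlier success was accidental; in other states only the streak resets.
        if state is State.PERHAPS:
            return State.UNLEARNED, 0
        return state, 0
    if state is State.UNLEARNED:
        return State.PERHAPS, 0
    streak += 1
    if state is State.PERHAPS and streak >= CONFIRMATIONS_FOR_PROBABLY:
        return State.PROBABLY, 0
    if state is State.PROBABLY and streak >= CONFIRMATIONS_FOR_STRONG:
        return State.STRONG, 0
    return state, streak
```

A scheme with misconception indicators, as mentioned above, would need a richer state than this
four-value notation.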

When we consider how to diagnose student progress in a holistic practice environment given a
current student model state, we see that a first issue to be addressed is what to test. In principle, the
student could have learned anything since we last tested him or her. For that matter, any prior
demonstration of competence might have been a fluke, so all positive entries in the student model are
tentative. Nonetheless, it would make no use of the student model at all if we merely tested for every
skill component at every opportunity. The student model enables testing for selected skill components
efficiently and in realistic performance contexts. It is the equivalent for steering testing of the
patient's chart for medical diagnosis.

We want to use the student model to generate constraints on the problems we pose to the student
as test items. These constraints should have the property that they make the items maximally
informative in tuning the student model to changes in the student's capabilities. What can guide our
choices of curricular goals to test? There are several possibilities. We discuss them in terms of the
four-level model of acquisition mentioned above (Unlearned, Perhaps learned, Probably learned, and
Strong). The Perhaps stage may be the most volatile. Suppose a curricular goal to be the attainment
of a specific production (carrying out a particular action when appropriate). When the action is
initially performed and is successful, there is a considerable chance that the student may not notice
the most important cues about the circumstance of the moment. So, he/she may be unable to
demonstrate the production under other circumstances. For all practical purposes, it was never really
learned at all. Until we have several demonstrations of the attainment of a curricular goal, we must
assume that our assessment of the student is unstable. Once we see multiple successful performances,
we will reclassify the student's competence to the Probably level. So, a first principle in selecting
current curricular goals to test is to be sure to check up on goals in the Perhaps state.

A second issue has to do with prerequisite skills. If skill A depends upon skill B, then there is no
point in regularly testing for A until B is demonstrated. Put another way, if there is ordering
information about the curricular goals, we may want to concentrate testing on the region in the
ordering between the goals in the Strong state and those in the Unlearned state, testing most often the
Perhaps goals, checking for progress on the next few Unlearned goals, and checking occasionally to see
if any goals have gone from Probably to Strong (operationally, we check to see if problems requiring
this subgoal's skills are answered correctly for several consecutive occasions with varying
requirements).

A third issue involves metacurricular concerns, especially those relating to extraneous sources
of difficulty, such as requiring complicated arithmetic performance, presenting information in a
medium known to be difficult for the student, etc. The basic rule of thumb we propose is to adapt these
difficulty variables to the current student model level. For example, if the goal is to detect a
movement from Unlearned to Perhaps for some curricular goal, then we want to set the metacurricular
difficulty levels low, so that the initial weak acquisition of that subgoal's knowledge is not masked by
too many other demands for processing capacity. For movement from Perhaps to Probably, an
appropriate problem constraint is to have some situational changes from the problem in which the
initial appearance of the relevant knowledge was first noted, since the theoretical motivation for the
distinction is the possibility of the correct actions having been linked to imprecise conditions. For
validating movement to Strong for some goal, there should be a demonstration of the relevant
capability under more difficult circumstances, since the question is whether the relevant knowledge is
robust enough to occur even under adverse conditions.
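
This rule of thumb amounts to a small lookup from the student's current state to the kind of
metacurricular constraint worth posting. A minimal Python sketch follows; the key names
`difficulty` and `vary_situation` and the function name are hypothetical labels chosen here, and
only the three cases stated above are encoded.

```python
def metacurricular_constraints(current_state):
    """Return the constraint suggested for detecting the next transition."""
    if current_state == "unlearned":
        # Unlearned -> Perhaps: keep extraneous processing demands low.
        return {"difficulty": "low"}
    if current_state == "perhaps":
        # Perhaps -> Probably: change the situation to probe the conditions.
        return {"vary_situation": True}
    if current_state == "probably":
        # Probably -> Strong: require performance under harder circumstances.
        return {"difficulty": "high"}
    return {}  # nothing further to detect for a Strong goal
```
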

The Concept of Constraint Posting

Our basic approach is to begin each cycle of diagnosis by sweeping through the curricular goal
structure, noting which subskills are "ripe" for testing. When the sweep is completed, we try to build
one or more problems that maximize our chances for accurately noting changes in the student's
current knowledge state, using some of the rules of thumb just described. We then use performance on
these made-to-order problems to decide how to update the student model; that is, we make a diagnosis.

Critical to the approach is the concept of constraint posting (Stefik, 1980). Rather than building
test items as we sweep through the curricular goal structure, we instead simply add to a list of item
constraints as we proceed. Each time we see an issue on which we would like more clarity, we post
that concern as a constraint on the test item generation process. When the sweep through the
curriculum is complete, we take the bundle of constraints and try to build items that satisfy them.
Stefik (1980) has shown that in many complex problem solving tasks involving multiple sources of
complexity and interactions between problem aspects (e.g., designing recombinant DNA experiments),
this constraint posting approach is much more efficient than piecemeal search processes.
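
The separation between posting and building can be sketched in a few lines of Python. This is a
toy illustration under stated assumptions, not Stefik's method or the tutor's code: the test for
"ripeness" (state equals "perhaps"), the tuple form of a constraint, and the stand-in
`build_item` function are all invented here.

```python
def sweep(goal_structure, student_model):
    """Sweep once through the goals, posting constraints rather than
    building a test item at each goal."""
    posted = []
    for goal in goal_structure:
        if student_model.get(goal) == "perhaps":  # assumed ripeness test
            posted.append(("must_address", goal))
    return posted


def build_item(posted):
    """Stand-in generator: one item is built only after the sweep is done,
    so it can try to satisfy every posted constraint at once."""
    return {"addresses": sorted(goal for _, goal in posted)}


goals = ["ohm", "kirchhoff", "series"]
model = {"ohm": "unlearned", "kirchhoff": "perhaps", "series": "perhaps"}
print(build_item(sweep(goals, model)))  # {'addresses': ['kirchhoff', 'series']}
```

Because nothing is built until the list is complete, a single problem can serve several posted
concerns instead of one problem being generated per concern.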

Constraint Posting Applied to Problem Generation

The item generation process, then, can work as follows. We first consider the student model.
Some of the subskills may be marked as reliably strong. These represent beachheads in the conquest
of ignorance. From these beachheads, as we venture out toward related subskills, we find some whose
status is uncertain (subskills that may or may not have been acquired yet and acquired subskills that
may or may not be reliable yet). We can make this search process more efficient if we know, for some
subgoals, which other subgoals are prerequisite to them and which [...]. A
subgoal for which a just-attained subgoal is prerequisite is likely to be a testing target, but we will also
give some weight to all subgoals, using the rules of thumb discussed above. Since we are making
steering decisions, we focus on the area of the curriculum that is currently the object of instruction.
For each subgoal that is a current target of testing, at least one constraint is posted: a test problem
must address that subgoal. For example, if we want to find out whether the student's capabilities in
applying Ohm's Law to series circuits have improved, we post constraints that the problem must
require Ohm's Law and must involve a series circuit.

We must also consider metacurricular planning issues. For example, a part of the system's
planning component may address the question of whether or not a physics student has adequate math
ability, or whether or not a student is able to learn information from graphical presentations.
Constraints can be posted based on metacurricular aspects of the student model, too. We may,
essentially, say to the test generator, "Since this student is poor in arithmetic, I can't find out if he has
learned (moved from unlearned to perhaps) how to use Ohm's Law to compute the current in a circuit if
the arithmetic comes out messy, so make the numbers come out simple."

Once the sweep through the curricular and planning structures is complete, the posted
constraints must be analyzed before test items are generated. Are there too many to handle at once? If
so, we might partition them into several clusters. Are the constraints inconsistent, in the sense that a
problem embodying some of them cannot, in principle, embody the others? For example, if we
constrain an electricity problem to be simple and we want to know both whether a student knows how
to deal with two resistors in series and also whether he knows how to deal with two in parallel, this
cannot all be done with one circuit problem. So, again, we might partition the constraints into bundles
that can comfortably be handled.
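
One simple way to carry out such a partition is a greedy pass over the posted constraints. The
Python sketch below assumes, purely for illustration, that each constraint is an
(attribute, value) pair and that two constraints are inconsistent exactly when they give the same
attribute different values; the size limit `max_per_bundle` and the function name are likewise
hypothetical.

```python
def partition(constraints, max_per_bundle=4):
    """Greedily place each (attribute, value) constraint into the first bundle
    that has room and does not already fix that attribute to another value."""
    bundles = []
    for attribute, value in constraints:
        for bundle in bundles:
            has_room = attribute in bundle or len(bundle) < max_per_bundle
            if has_room and bundle.get(attribute, value) == value:
                bundle[attribute] = value
                break
        else:
            # No existing bundle can take this constraint: start a new one.
            bundles.append({attribute: value})
    return bundles


posted = [("complexity", "simple"), ("topology", "series"), ("topology", "parallel")]
print(partition(posted))
# [{'complexity': 'simple', 'topology': 'series'}, {'topology': 'parallel'}]
```

In this sketch a constraint meant to apply to every problem (here, simplicity) lands only in the
first bundle; a fuller version would copy such global constraints into each bundle.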

Finally, one or more holistic problems that satisfy the constraints posted must be posed. From
performance on a problem, either a diagnosis can be made immediately or a more focused problem can
be specified for further testing. In essence, we are dealing with a qualitative process that has many of
the properties of one of psychometrics' most important quantitative processes: adaptive testing.

An Example from a Tutor for Basic Electricity Principles

To illustrate some of these ideas, we describe MHO, a tutor that teaches basic electrical
principles (current, voltage, and resistance; Kirchhoff's Laws and Ohm's Law). MHO is designed to
work in both a problem-posing and an exploration mode. In the exploratory mode, the student can
make measurements on circuits and even build his own circuit. In the didactic mode, though, MHO
must decide what problem to present to the student. Thus, it faces the same problem that a testing
program would face: to examine the student model and determine which problem to pose to optimize
the information value of the student's answer.

MHO's student model is a specialized form of checklist: a goal structure for teaching the specific
knowledge it wants to teach. The checklist derives from the curriculum and planning issues shown in
Figure 3 above. For each subgoal, the student is marked as being in one of the four states described
above, as shown in Table 1. Quantitative scores could be entered as well. What is critical is that some
student knowledge levels are considered to indicate potential for change while others are not. For
example, a student who knows certain material is not likely to suddenly stop knowing it, but a student
who has yet to learn some material is in a more changeable state.

From the subgoal scores and other knowledge, such as curricular sequencing and prerequisite
relationships, it is possible to define a set of subgoals that are most unstable. These are the subgoals
that may require more frequent measurement in order for instruction to be steered well. As discussed
above, they represent the front along which instruction is progressing through the curriculum goal
structure. The task of a test item generator, then, is to generate a test item that will be especially
informative about this front. MHO does this by posting a set of constraints for the test problem. In the
student model given in Table 1, the Series, Kirchhoff's Law, and Current subgoals are at this front. Each
constraint helps adapt the steering feedback to the student's current state. To see how this is done, we
need to consider MHO's architecture and the subject matter that it teaches and tests.

Architecture 

At this time, MHO teaches and tests several levels of DC circuits. It poses problems such as the
one shown in Figure 4. We call the architecture used in MHO the Bite-Size Architecture. It is an
object-oriented architecture for intelligent tutoring systems (see Footnote 3). An object is a
semiautonomous piece of computer program that can be called upon to achieve particular goals. It
includes both data structures and procedural capabilities. Object-oriented programming involves
designing sets of objects that can efficiently interact to solve problems. Each curriculum subgoal (and
also each metacurricular planning issue and each problem format) is represented by an object called a
"bite." Within the computer program, a bite contains a record of the student's performance on a
subgoal and the knowledge needed to post a constraint for that subgoal.

Voltage, for example, is represented by a bite in MHO. That bite has rules for teaching about
voltage. It contains information pertinent to developing an understanding of what voltage represents,
including the constraints it should post to create relevant problems. Also, it can update the student
model information by noting how the student does on problems relevant to its subgoal. One byproduct
of this architecture and the curricular model on which it is based is that a tutoring program's
knowledge is modular and can easily be expanded by adding additional curricular objects along with
their pointers to the other knowledge components (which may involve additions to those components
as well). For example, MHO's designers are now expanding it to include curricular goals involving
simple alternating current circuits.
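
To make the idea of a bite concrete, here is a minimal sketch in Python rather than Loops. It is
not MHO's implementation: the class name `Bite`, its attributes, and the tuple form of a
constraint are assumptions chosen to mirror the two responsibilities named above (keeping a
performance record and knowing what constraint to post).

```python
class Bite:
    """Hypothetical stand-in for one curricular object (a "bite")."""

    def __init__(self, subgoal, constraint):
        self.subgoal = subgoal
        self.constraint = constraint  # what to post when this bite is tested
        self.history = []             # record of the student's performance

    def post_constraint(self, posted):
        """Add this bite's constraint to the shared list of posted constraints."""
        posted.append(self.constraint)

    def record(self, correct):
        """Note how the student did on a problem relevant to this subgoal."""
        self.history.append(bool(correct))


curriculum = [Bite("Voltage", ("concept", "voltage"))]
# Expanding the tutor means adding another object; the sweep itself is unchanged.
curriculum.append(Bite("AC circuits", ("circuit", "simple-ac")))

posted = []
for bite in curriculum:
    bite.post_constraint(posted)
```

The modularity claim shows up in the last few lines: the loop that collects constraints does not
need to know which bites exist.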

Problem Generation 

MHO poses problems by presenting a circuit diagram and asking a question about it. The
machinery used in problem generation chooses most of the circuit components randomly, but it is
constrained by both general and specific curricular subgoals (bites) which the student has not yet
mastered. Some of the choices represented by these constraints are the following:

a. A problem can be posed in qualitative, quantitative or relative form. 

b. A problem can vary in the complexity of the arithmetic it requires and the complexity of
the circuit diagram to which it refers. This is determined by a global assessment of how much of the
curriculum the student has mastered.

c. The problem can require knowledge of Ohm's Law or either of Kirchhoff's Laws.

d. The problem can focus on voltage, current or resistance.

e. The problem can focus on series or parallel circuit topologies. (MHO also worries about
where the meters are placed in circuit diagrams, since there are some placements that students have
particular difficulty handling, but we ignore that matter to make presentation of the basic approach
more straightforward.)

The product of constraint posting is stored as a list structure (see Footnote 3) to be used as the
basis for problem generation and problem solving. This list structure contains information that
specifies how to create a circuit and a problem based on that circuit, what the circuit should look like,
and what electronic concepts are relevant. An example of such a list, derived from the student model
shown in Table 1, is:

[1] ((((Rel Simple) ($ Kirchhoff) ($ I-Series)) (UninterruptedS)) Series).

This list represents the constraints that have been posted in sweeping the model shown in Table
1 and is the starting point for automatic generation of a problem. Rel stands for a Relative problem
that will pose a simple question asking if two areas of the circuit will have the same measurement (in
this case, current). Simple specifies the student's level of general understanding and will cause the
circuit to be very simple in structure. Kirchhoff is the law this problem centers around. I-Series is a
specialization of Kirchhoff's law, that current is equal at all points in a series array. UninterruptedS
informs the problem generator that one meter should appear next to another with no other
components between them (this is the simplest form for a problem looking at Kirchhoff's Law).
Constrained by this information, the problem generator can develop many different circuits and pose
many different problems about them, so it is quite plausible to do as much steering testing as any
student requires and also to give students sets of appropriate problems as homework.
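
For readers who want to experiment with such a constraint list outside Lisp, the Python sketch
below reads the parenthesized form into nested lists and then lists its atoms. The helper names
`parse` and `flatten` are introduced here for illustration, and the spelling of the atoms
(Kirchhoff, I-Series) follows the prose description of list [1].

```python
def parse(text):
    """Parse a parenthesized list into nested Python lists of strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "(":
            return token  # an atom such as Rel or Series
        items = []
        while tokens[pos] != ")":
            items.append(read())
        pos += 1  # skip the closing parenthesis
        return items

    return read()


def flatten(tree):
    """Yield every atom in the nested list, left to right."""
    for item in tree:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item


spec = parse("((((Rel Simple) ($ Kirchhoff) ($ I-Series)) (UninterruptedS)) Series)")
print(spec[-1])             # Series
print(list(flatten(spec)))
```

The nesting is preserved, so a generator could treat the innermost group (problem form,
difficulty, law, specialization) separately from the meter placement hint and the topology.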

At the next, more elaborated, level of representation the circuit is designated as a network of
resistors, a combination of series and parallel subnets with a power source. A more detailed list breaks
this circuit into four nodes, each of which represents a side of a rectangular circuit. The nodes are
created separately and then put together to make up a circuit. One at a time, the nodes are passed into
a recursive function called MakeCircuitString to be elaborated further. MakeCircuitString makes
decisions such as how many resistors are placed on a node, and whether these resistors should appear
in a parallel or series net. These decisions are based on the information from the first list.

Simple instructs MakeCircuitString to limit the number of resistors that appear and to
otherwise make the circuit conform to the specifications of a simple circuit. The Simple specifications
keep the components that will be drawn to a minimum. Simple also informs MakeCircuitString that,
depending on what net we are working with, all nodes should be of this kind. I-Series specifies the
net to be used: all sides are series arrays. If this were a Difficult problem, some sides might have
parallel subnets and others series. An example of a simple circuit, [1], that has passed through
MakeCircuitString is

[2] ((VoltageSource) (Series (Resistor) (Resistor)) (Parallel (Resistor) (Resistor)) (Wire)).

Figure 5 below shows the circuit designated by [2].
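
The four-node structure can be mimicked with nested Python lists. The sketch below is not
MakeCircuitString itself, whose decision rules are only outlined above; `subnet` and `render` are
hypothetical helpers that build one side of the rectangle and print the whole circuit back in the
parenthesized notation of [2].

```python
def subnet(kind, n_resistors):
    """Build one side of the circuit: a Series or Parallel net of resistors."""
    return [kind] + [["Resistor"] for _ in range(n_resistors)]


def render(tree):
    """Recursively turn nested lists back into the parenthesized list form."""
    if isinstance(tree, str):
        return tree
    return "(" + " ".join(render(part) for part in tree) + ")"


# Four nodes, one per side of the rectangular circuit.
circuit = [["VoltageSource"], subnet("Series", 2), subnet("Parallel", 2), ["Wire"]]
print(render(circuit))
# ((VoltageSource) (Series (Resistor) (Resistor)) (Parallel (Resistor) (Resistor)) (Wire))
```

A generator honoring the Simple constraint would, under this representation, simply keep
`n_resistors` small and avoid nesting one subnet inside another.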

The final specifications development step is determining what problem should be posed about the
circuit, where meters should be placed, and what question should be asked about them. This step
requires some information from the first list, e.g., [1]. I-Series reveals whether current or voltage is
the target concept, while UninterruptedS holds information pertaining to how many problems and
where problems should appear. Several recursive functions tear apart the second list and insert
problem information (mainly meters) where it is best suited. Using the above example and placing
several meters into the list, one example of the next stage is

[3] ((Problem Rel current after on (VoltageSource)) (Series (Resistor) (Problem Rel current before
off (Resistor))) (Parallel (Problem Rel current after on (Resistor)) (Resistor)) (Wire)).

This list is then passed to an intelligent problem developer, which composes and draws the
circuit. Figure 6 below shows a display corresponding to [3]. The question posed to the student will
end up being, "Is the current at Meter A higher, lower, or the same as the current at Meter B?"

The Simulator assigns values to the components, i.e., resistance and voltage, and then finds the
dependent values, i.e., current, voltage drops over resistors, etc. It can, for simple problems, ensure
that all the values for current and voltage will be integral, and also can determine whether or not
resistors and voltage sources should be displayed. If the circuit were more complex, an iterative
propagation would occur next. Resistance for a subnet of a complex circuit, for example, would be
calculated by asking each subnet component its resistance and then adding them together. Parallel
structures are handled recursively as well, using the appropriate formulae.
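
The recursive calculation can be sketched as follows, using the standard formulae: resistances in
series add, and for a parallel net the reciprocals add. This Python fragment is an illustration
under assumptions, not the Simulator's code; the tuple representation of a net and the function
names are invented here, and exact fractions are used so that integrality can be checked directly.

```python
from fractions import Fraction


def resistance(net):
    """Equivalent resistance of a nested (kind, parts) net.

    `kind` is "series" or "parallel"; a bare number is one resistor in ohms.
    """
    if not isinstance(net, tuple):
        return Fraction(net)
    kind, parts = net
    values = [resistance(part) for part in parts]  # ask each component
    if kind == "series":
        return sum(values, Fraction(0))
    if kind == "parallel":
        return 1 / sum((1 / v for v in values), Fraction(0))
    raise ValueError("unknown net type: " + str(kind))


def total_current(volts, net):
    """Ohm's Law for the whole circuit: I = V / R."""
    return Fraction(volts) / resistance(net)


net = ("series", [2, ("parallel", [6, 3])])
print(resistance(net))         # 4
print(total_current(12, net))  # 3
```

A generator aiming for simple arithmetic could try candidate component values and keep a set only
when `total_current(...).denominator == 1`, i.e., when the current comes out integral.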

The Softness of Student Classifications

We conclude by reconsidering more broadly the issue of diagnostic assessment of cognitive skills
to steer instruction. Fundamentally, cognitive skill, like physical skill, often requires substantial
practice of its basic components in the contexts in which they are to be applied. Actions can be learned
without learning the exact conditions for which they are appropriate. Newly learned, and
consequently weak, knowledge can fail to be used because stronger but incorrect knowledge is
overgeneralized from related situations. Processing capacity demands due to one subskill may be so
great as to make the execution of another, newly formed subskill impossible. This means that for most
of the course of learning, a fundamental principle is true:

One cannot be sure a subskill has not been learned just because it
was not demonstrated on an occasion where it should have been.

On the other hand, cognitive skill, like physical skill, is partly redundant. Weak methods can
sometimes overcome the lack of appropriate domain knowledge. Sometimes, a problem that in theory
should require a particular subskill is solved correctly by accident. The correct action may be taken
with incorrect knowledge of the conditions under which it is appropriate, or an incorrect action may
turn out to be "safe" this time only. This leads to a second fundamental principle:

One cannot be sure a subskill is completely learned just because it 
has been demonstrated. 

These two principles suggest that the steering approach to diagnostic testing, in which local
microtesting is embedded in the curriculum to steer instruction, is a more valid approach than the
broader diagnostic testing that has become part of many current monitoring programs in our schools.
By asking broad, generic questions (e.g., "What can I diagnose knowing nothing about the student in
advance and giving only a general test?") we can get only broad, generic answers. That is, we can
know how well, in general, learning is proceeding, but we can't steer specific children's education with
such broad indicators, any more than we could steer a ship if all we had was an hourly account of how
close to the correct path we were.

Empirical experience and cognitive theory tell us that an inherent property of cognitive
performance is that it is unreliable unless substantial practice has occurred and that success can come
for multiple reasons. These factors have to be taken into account in diagnosis. Ironically, perhaps the
less reliable steering testing approach provides better steering capability than the highly refined
approaches used in current psychometric efforts at diagnosis. But this is no different than the irony
that continuously knowing approximately where you are affords better steering capability than
occasionally knowing how well you are steering, in general.

The field of testing has worked to try to become efficient at making precise estimates from
inherently unreliable data, and it has done very well at this. Approaches such as item-response theory
and adaptive testing have allowed the broad and vague measures that tests provide to be made ever
more efficiently. Further progress, and especially progress in steering testing (as opposed to
certification and selection testing), will depend on better use of information we already have, or can
readily get, about the cognitive requirements of the performances and student competences relative to
those performances that interest us. Like the physician, we will, in steering the course of a child's
education, be better guided by sketchy data tied to specific theoretical analysis than by precise, but
general, indicators.

Our approach can be contrasted to the steering forms used in the curricula that grew from Bob
Glaser's work on individualized instruction. There, the steering idea was also used. However, the
technology of the time did not permit more than a short, uniform mastery test after each lesson. This
allowed adequate teaching of the higher-aptitude student but did not handle the remediation problem
discussed above. That is, it suffered from having to treat each curricular goal and its corresponding
student capability as separable from every other, and it could not handle the problem of core learning
without fringe transfer. There was much discussion during the period of that curriculum development
about having remediation that was more than just doing the same thing again. The present approach
to steering testing, which permits adaptation grounded in cognitive analysis of the instructional
domain, rests on the goal structure for educational research established during the period of work by
Bob Glaser and his colleagues on individually-prescribed instruction.

1. In Lesgold (in press) a three-category model was presented. Since then, we have become
convinced that the curriculum and treatment categories should be separated.

2. This issue is addressed more completely in Lesgold (in press).

3. See Bonar, Cunningham and Schultz, 1986, for a description of An Object-Oriented Architecture
for Intelligent Tutoring Systems. MHO is implemented in Loops, Xerox's proprietary object-oriented
specialization of the standard artificial intelligence language Lisp. The graphics and student interface
are handled via an interface package called Chips. Chips is a program developed at the Learning
Research and Development Center, primarily by John O. Corbett and Robert E. Cunningham, with
some contribution by Andrew D. Bowen. The Chips tools allow circuit displays to be designed so the
student can click the mouse (a mouse is a pointing device that causes a marker to move on the screen
as the device is moved on a table top; it often contains buttons as well, so that the computer user can
point to an object on the screen by moving the marker over that object and then pressing a button) on
any of the components and thereby cause a menu of query options to appear. Each object can behave
differently: when a student clicks on a meter, a question is asked; when he/she clicks on a resistor, a
special menu of options is presented.

Figure 2. Remedial Knowledge May Not Be Core Knowledge.

Figure 3. Examples of Different Knowledges Needed for Steering Testing.

Figure 4. Example Problem from MHO Test Generator.

Figure 5. Circuit described by Eq. 2.

Figure 6. Circuit described by Eq. 3.

Table 1. Example Student Model.

Metacurricular Issues
  Numerical Difficulty      Simple vs. Difficult
  Circuit Complexity        Simple vs. Complex

Curricular Subgoals         Current Student State
  Ohm's                     Unlearned
  Kirchhoff's               Perhaps
  Architecture
    Series                  Perhaps
    Parallel                Unlearned
  Concepts
    Current                 Perhaps
    Resistance              Unlearned
    Voltage                 Unlearned

Treatment Issues
  Problem Format            Qualitative, Quantitative, Relative